ABSTRACT

We quantify the angular distribution of radio sources in the NVSS by measuring
the two-point angular correlation function $w(\theta)$. By careful consideration
of the resolution of radio galaxies into multiple components, we are able to
determine both the galaxy angular clustering and the size distribution of giant
radio galaxies. The slope of the correlation function for radio galaxies agrees
with that for other classes of galaxy, $\gamma = 1.8$, with a 3D correlation
length $r_0 \sim 6\,h^{-1}$ Mpc (under certain assumptions). Calibration
problems in the survey prevent clustering analysis below
$S_{1.4\,\mathrm{GHz}} = 10$ mJy. About 7 per cent of radio galaxies are
resolved by NVSS into multiple components, with a power-law size distribution.
Our work calls into question previous analyses and interpretations of
$w(\theta)$ from radio surveys.

Key words: large-scale structure of Universe - galaxies: active - surveys 






1 INTRODUCTION 

Describing the large-scale structure of the Universe is of fun- 
damental importance for testing theories of structure forma- 
tion and measuring the cosmological parameters. The angu- 
lar distribution of galaxies, whilst merely a projection of the 
true 3D structure, is useful to quantify: it is easy to assemble 
a large sample of objects and it is possible to de-project the 
angular clustering (in a global statistical manner) to mea- 
sure the 3D clustering. 

Active galactic nuclei (AGN) detected at radio wave- 
lengths are a powerful means of delineating large-scale struc- 
ture. They can be routinely detected over wide areas of the 
sky out to very large redshifts (z ~ 4) and hence describe 
the largest structures (and their evolution); radio emission 
is not sensitive to dust obscuration; accurate large-scale cal- 
ibration should be possible; and the current generation of 
radio surveys such as WENSS, FIRST and NVSS contain 
AGN in very large numbers ($\sim 10^6$).

We quantify the angular distribution of AGN using the
two-point angular correlation function (Peebles 1980). It is
well-known that this method loses much of the clustering 
information: two very different structure morphologies can 
have the same correlation function. However, the statistic 
provides a simple point of contact with prediction, the sta- 
tistical errors are well-understood, it has a relatively sim- 
ple de-projection into 3D, and perhaps most importantly, 
its measurement can reveal observational problems with the
survey data. It is an essential first analysis step before the
application of more powerful techniques is warranted. 

The key difference between angular correlation function analyses in the
optical (e.g. Maddox, Efstathiou & Sutherland 1996) and the radio regimes is
that in the latter, the wide redshift range of radio sources washes out much
of the clustering signal through the superposition of unrelated redshift
slices. Hence an angular clustering signal was only marginally detected in
datasets such as the 1.4 GHz Green Bank survey (Kooiman, Burns & Klypin 1995)
and the Parkes-MIT-NRAO survey (Loan, Wall & Lahav 1997). Deeper radio surveys
such as FIRST (Becker, White & Helfand 1995) and WENSS (Rengelink et al. 1998)
revealed a clearer imprint of structure, quantified by the correlation
function analyses of Cress et al. (1996) and Magliocchetti et al. (1998). Here
we use the more extensive sky coverage and source list of the NVSS (Condon et
al. 1998) to test the robustness of previous conclusions with a higher
signal-to-noise ratio. We also reconsider the two critical observational
effects: radio galaxies being resolved into separate components of radio
emission, and calibration problems causing apparent source surface density
gradients and discontinuities.



2 OBSERVATIONAL DATA AND METHODS 

2.1 The angular correlation function 

The angular correlation function $w(\theta)$ compares the observed (clustered)
distribution to a random (unclustered) distribution of points across the same
survey area, by simply measuring the fractional increase in the number of
close pairs separated by angle $\theta$. Specifically, if we let $DD(\theta)$
be the number of unique pairs of galaxies with separations
$\theta \to \theta + \delta\theta$, and $RR(\theta)$ be the number of random
pairs in the same separation range, then $w(\theta)$ can be estimated as

$$ w = \frac{DD}{RR} - 1 \qquad (1) $$

Landy & Szalay (1993) point out that the estimator

$$ w = \frac{DD}{RR} - 2\,\frac{DR}{RR} + 1 \qquad (2) $$

which also involves the cross-pair count $DR(\theta)$, is superior
because it has a much smaller variance, given by the 'Poisson' error

$$ \Delta w = \frac{1 + w}{\sqrt{DD}} \qquad (3) $$



assuming the statistical error in the random sets can be 
neglected. This can be achieved by averaging over a large 
number of random sets to obtain DR and RR. 
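
As an illustration only, the estimators above can be sketched in Python for a
small catalogue. The function names `pair_counts` and `landy_szalay`, the
brute-force pair counting, and the normalisation of each count by its total
number of pairs are assumptions of this sketch; they are not a description of
the code used for the NVSS measurement.

```python
import numpy as np

def pair_counts(a, b, edges, auto):
    """Histogram of angular separations (degrees) between catalogues a and b.

    a, b  : arrays of shape (N, 2) holding (RA, Dec) in degrees.
    edges : bin edges in degrees.
    auto  : True when a and b are the same catalogue, so that each
            unique pair is counted once.
    """
    ra1, dec1 = np.radians(a[:, 0])[:, None], np.radians(a[:, 1])[:, None]
    ra2, dec2 = np.radians(b[:, 0])[None, :], np.radians(b[:, 1])[None, :]
    # Haversine formula for the great-circle separation.
    h = (np.sin((dec2 - dec1) / 2) ** 2
         + np.cos(dec1) * np.cos(dec2) * np.sin((ra2 - ra1) / 2) ** 2)
    sep = np.degrees(2 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0))))
    if auto:
        sep = sep[np.triu_indices(len(a), k=1)]  # unique pairs only
    counts, _ = np.histogram(sep.ravel(), bins=edges)
    return counts.astype(float)

def landy_szalay(data, rand, edges):
    """Landy-Szalay estimate of w(theta) and its 'Poisson' error per bin."""
    nd, nr = len(data), len(rand)
    dd = pair_counts(data, data, edges, auto=True)
    rr = pair_counts(rand, rand, edges, auto=True)
    dr = pair_counts(data, rand, edges, auto=False)
    # Assumption: normalise each count by the total number of pairs, so the
    # data and random catalogues need not be the same size.
    dd_n = dd / (nd * (nd - 1) / 2)
    rr_n = rr / (nr * (nr - 1) / 2)
    dr_n = dr / (nd * nr)
    w = (dd_n - 2 * dr_n + rr_n) / rr_n
    err = (1 + w) / np.sqrt(dd)  # uses the raw data-data pair count
    return w, err
```

The brute-force separation matrix scales as the square of the catalogue size,
so a catalogue as large as the NVSS would need a tree- or grid-based pair
counter instead; the estimator itself is unchanged.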

Figure 1. Variations in NVSS source density as a function of
declination for flux thresholds 2 mJy (filled circles) and 10 mJy
(open circles). The declination range of each array configuration
is also indicated. The error bar on the number of sources $N$ in a
bin is $\sqrt{N}$. Masked regions are excluded from this measurement.



2.2 The NVSS 

The 1.4 GHz NVSS (NRAO VLA Sky Survey, Condon et al. 1998) covers the sky
north of declination $-40^\circ$ (82 per cent of the celestial sphere). The
source catalogue contains $1.8 \times 10^6$ sources and is claimed to be 99
per cent complete at integrated flux density $S_{1.4\,\mathrm{GHz}} = 3.5$
mJy. The survey was performed with the VLA in D configuration, with DnC
configuration used for fields at high zenith angles ($\delta < -10^\circ$,
$\delta > 78^\circ$), and the FWHM of the synthesized beam is 45 arcsec. The
raw fitted source parameters are processed by a publicly-available program
NVSSlist, which performs the deconvolution and corrects for known biases to
produce source diameters and integrated flux densities. We used NVSSlist
version 2.16 for our analysis.

Before measuring the angular correlation function from 
the survey, we must mask out various regions of the sky; 
identical masks must be used in the random comparison 
catalogues. Firstly, we masked the survey within 5° of the 
Galactic plane to eliminate Galactic sources (such as su- 
pernova remnants). Secondly, by virtue of the fitting algo- 
rithm, nearby bright extended radio galaxies can appear in 
the NVSS as a large number of separate elliptical Gaussians. 
These objects introduce a very large number of spurious 
close pairs. By visual inspection of the survey, we compiled 
a list of 22 masks around such galaxies. Finally, the sidelobes 
of very bright sources may appear as spurious entries in the 
source catalogue. As a precautionary measure, we placed cir- 
cular masks of radius 0.5° around all radio sources brighter 
than 1 Jy (although ultimately this was found to have no 
effect on the measurement). 
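
A minimal sketch of how two of these masks might be applied to a catalogue
(and, identically, to each random catalogue) is given below. The function
names, the argument layout and the assumption that the Galactic latitude of
each source has already been computed are all hypothetical; the 22
hand-compiled masks around extended galaxies are not reproduced.

```python
import numpy as np

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    h = (np.sin((dec2 - dec1) / 2) ** 2
         + np.cos(dec1) * np.cos(dec2) * np.sin((ra2 - ra1) / 2) ** 2)
    return np.degrees(2 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0))))

def unmasked(ra, dec, gal_lat, bright_ra, bright_dec,
             plane_cut=5.0, radius=0.5):
    """Boolean array: True where a position survives both masks.

    gal_lat    : Galactic latitude of each position in degrees
                 (assumed to be precomputed by the caller).
    bright_ra, bright_dec : positions of sources brighter than 1 Jy.
    plane_cut  : half-width of the Galactic-plane mask in degrees.
    radius     : radius of the circular bright-source masks in degrees.
    """
    ra, dec, gal_lat = map(np.asarray, (ra, dec, gal_lat))
    keep = np.abs(gal_lat) >= plane_cut
    for b_ra, b_dec in zip(bright_ra, bright_dec):
        keep &= angular_sep_deg(ra, dec, b_ra, b_dec) > radius
    return keep
```

Applying the same function to the data and to every random catalogue keeps
the survey geometry identical in the pair counts.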

2.3 Calibration/surface density problems 

The NVSS suffers from calibration problems at low flux densities: spurious
systematic fluctuations in source surface density. As illustrated by Figure 1,
declination-dependent variations occur at flux densities below 10 mJy,
including significant jumps at the declinations at which the array
configuration changes. These stem from the difficulty in compensating for the
sparse $uv$-coverage of the NVSS (W. Cotton, private communication, 2001).

A varying source density will spuriously enhance the measured value of
$w(\theta)$. This is because the number of close pairs of galaxies depends on
the local surface density ($DD \propto \sigma^2$), but the number of close
pairs in the random distribution depends on the global average surface density
($RR \propto \bar{\sigma}^2$). Systematic fluctuations mean
$\overline{\sigma^2} > \bar{\sigma}^2$, so according to equation 1,
$w(\theta)$ is increased.

To quantify this effect, we can show (Blake & Wall, in preparation) that on
angular scales less than those on which $\sigma$ is varying, $w(\theta)$ is
subject to a constant offset $\overline{\delta^2}$, where
$\delta = (\sigma - \bar{\sigma})/\bar{\sigma}$ is the surface over-density.
To estimate the magnitude of the offset, take a simple toy model in which a
survey is divided into two equal areas between which there is a fractional
shift $\epsilon$ in density. Each area then has $\delta = \pm\epsilon/2$, so
$\overline{\delta^2} = \epsilon^2/4$. So
$\overline{\delta^2} \approx 2.5 \times 10^{-3}$ for NVSS sources above 2 mJy
($\epsilon \approx 0.1$), and $\overline{\delta^2} \approx 10^{-4}$ above 10
mJy ($\epsilon \approx 0.02$).
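
The toy-model arithmetic can be checked with a few lines of Python. This is a
sketch under the assumption that the density shift is expressed as a fraction
of the mean density (0.1 and 0.02 for the two thresholds); the function name
`density_offset` is introduced here for illustration.

```python
def density_offset(eps):
    """Mean squared over-density for two equal areas whose surface
    densities differ by a fraction eps of the mean density."""
    mean = 1.0  # mean surface density in arbitrary units
    densities = (mean * (1 - eps / 2), mean * (1 + eps / 2))
    deltas = [(s - mean) / mean for s in densities]
    return sum(d * d for d in deltas) / len(deltas)

print(density_offset(0.10))  # close to 2.5e-3 (2 mJy threshold)
print(density_offset(0.02))  # close to 1e-4 (10 mJy threshold)
```

For comparison, the offset at 2 mJy is larger than the clustering amplitude
at one degree quoted in section 4.1, which is of order $10^{-3}$.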



3 MULTIPLE-COMPONENT SOURCES 

The complex morphologies of radio sources mean that a sin- 
gle galaxy can be resolved in a radio survey as two or more 
closely-separated components of radio emission. The median 
angular size of mJy radio sources is several arcseconds (Oort
1988). Thus NVSS, with its beam-size of 45 arcsec, will leave 
the majority of this population unresolved. However, there 
is a significant tail to the size distribution: many arcminute- 
size radio sources have been discovered (e.g. Lara et al. 2001) 
whose sub-structure will be resolved by NVSS. These mul- 
tiple components will produce spurious clustering at small 
separations. 

Consider a single radio source in the survey. How many pair separations would
this contribute to a separation bin at angular distance $\theta$ of width
$\delta\theta$? This bin contains area $\delta A = 2\pi\theta\,\delta\theta$
and hence the probability of another source falling in it is
$\sigma_g\,\delta A\,[1 + w_g(\theta)]$ where $\sigma_g$ is the surface
density of galaxies and $w_g$ is the galaxy angular correlation function.
Suppose another source does fall in the bin. In general, both sources will be
observed as a group of radio components: a single pair separation is replaced
by several pair separations. Some of these pair separations will lie within an
individual source, the rest will be between the different sources.

Suppose $\theta$ is large enough that there are no pair separations of size
$\theta$ within individual sources. If there are on average $\bar{n}$ radio
components per source, then there will be on average $\bar{n}^2$ pair
separations between these sources. As the radio components will be roughly
symmetrically distributed about their host galaxy, the average pair separation
between the components remains equal to that of the original pair of sources,
$\theta$. It follows that the expected number of radio component pairs
involving our original source in this separation bin is
$\sigma_g\,\delta A\,[1 + w_g(\theta)] \times \bar{n}^2$. Hence over the whole
survey, the expected number of radio component pairs in this bin is
(neglecting edge effects)

$$ DD = \frac{1}{2}\,N_g\,\sigma_g\,\delta A\,[1 + w_g(\theta)]\,\bar{n}^2 $$

where we have divided by 2 because all pairs are counted twice. If $N_r$ and
$\sigma_r$ are the total number and surface density of radio components, then
$N_r = \bar{n} \times N_g$ and $\sigma_r = \bar{n} \times \sigma_g$ and this
may be written

$$ DD = \frac{1}{2}\,N_r\,\sigma_r\,\delta A\,[1 + w_g(\theta)] $$

Generating a random set containing $N_r$ components, the number of random
pairs in this bin is

$$ RR = \frac{1}{2}\,N_r\,\sigma_r\,\delta A $$

and hence the measured correlation function from the radio components (using
equation 1) is simply $w_r = w_g$. So multiple radio components have no effect
on the measured correlation function at angular separations bigger than
individual sources.

At what separation $\theta$ may we neglect the effect of radio sources of size
$\theta$? The problem may be clarified by a realistic example. Consider a
separation bin at $\theta = 3$ arcmin $= 0.05^\circ$ of width
$\delta\theta = 0.01^\circ$. Putting $\sigma = 10$ deg$^{-2}$, there are
$\sigma\,2\pi\theta\,\delta\theta/2 = 1.6 \times 10^{-2}$ random pairs per
original source in this bin. But we are trying to measure $w(\theta)$, which
is a fractional enhancement $w_g(0.05^\circ) \approx 2 \times 10^{-2}$ of this
number of random pairs, which is $3 \times 10^{-4}$ pairs per original source.
Observations (e.g. Lara et al. 2001) indicate that $> 1$ in $10^4$ radio
sources are as large as 3 arcmin. Hence we cannot neglect pair separations
within individual sources. Thus the small number of surplus pairs that
determine the value of $w(\theta)$ means that even a tiny fraction of giant
radio sources can substantially change our measurement.
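
The numbers in this example can be reproduced with the short Python sketch
below. The function name `pairs_per_source` and the variable names are
introduced here for illustration; the input values are those quoted in the
text.

```python
import math

def pairs_per_source(sigma, theta, dtheta, w_g):
    """Random and clustering-excess pairs per source in one separation bin.

    sigma  : surface density in deg^-2
    theta  : bin separation in degrees
    dtheta : bin width in degrees
    w_g    : galaxy angular correlation function at theta
    """
    random_pairs = sigma * 2 * math.pi * theta * dtheta / 2
    excess_pairs = w_g * random_pairs
    return random_pairs, excess_pairs

random_pairs, excess_pairs = pairs_per_source(
    sigma=10.0, theta=0.05, dtheta=0.01, w_g=2e-2)
giant_fraction = 1e-4  # lower bound on sources as large as 3 arcmin

print(f"random pairs per source: {random_pairs:.2e}")
print(f"excess pairs per source: {excess_pairs:.2e}")
print(f"giant fraction / excess: {giant_fraction / excess_pairs:.2f}")
```

The last ratio is of order unity, which is the point of the example: pairs
internal to giant sources are comparable in number to the surplus pairs that
carry the clustering signal.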

Suppose $\theta$ is small enough that radio component pairs originating within
individual sources are also important. If $e$ is the fraction of sources
observed to have multiple components, and $f(\theta)\,\delta\theta$ is the
fraction of those component separations in the range
$\theta \to \theta + \delta\theta$, then the total number of extra separations
in the bin is $DD = N_g\,e\,f(\theta)\,\delta\theta$ and hence in this
separation regime,

$$ w_r(\theta) = w_g(\theta) + \frac{e\,f(\theta)}{\bar{n}^2\,\pi\,\theta\,\sigma_g} \qquad (4) $$

The angular scale on which the dominant source of NVSS pairs changes from
multi-component sources to individual sources is 6 arcmin ($0.1^\circ$), as
evidenced by a clear break in the measured correlation function (see Figure
2).

Figure 2. Measurement of $w(\theta)$ for NVSS sources with
$S_{1.4\,\mathrm{GHz}} > 15$ mJy. The best-fitting sum of two power-laws is
overplotted.




Figure 3. Contours of constant $\chi^2$ in the space of the clustering
parameters $(a, b)$. $1\sigma$ and $2\sigma$ contours are shown ($\chi^2$
increasing by 2.30 and 6.17 from its minimum) for flux thresholds 50 mJy
(dotted), 20 mJy (dashed) and 10 mJy (solid). The bold circle indicates the
best-fitting combination for the 10 mJy threshold.



4 RESULTS 

4.1 The angular correlation function 

Figure 2 displays the measured $w(\theta)$ for $S_{1.4\,\mathrm{GHz}} > 15$
mJy. We use the Landy-Szalay estimator (equation 2) with the error from
equation 3. A good fit to $w(\theta)$ is a sum of two power-laws. One
power-law dominates at small angles ($\theta < 0.1^\circ$) and is a direct
indication of the size distribution $f(\theta)$ of giant radio sources. The
other power-law dominates at large angles ($\theta > 0.1^\circ$) and describes
the clustering between individual radio sources. We measured $w(\theta)$ for a
variety of flux thresholds and fitted a sum of two power-laws to the results.
Turning first to radio galaxy clustering, we find:

• The slope of the clustering power-law does not depend on flux threshold and
is consistent with $w(\theta) = a\,\theta^{-0.8}$, as found for
optically-selected galaxies. Our most accurate determination (for a 10 mJy
threshold) is $-0.85 \pm 0.1$.

• The clustering amplitude is also independent of flux threshold: our most
accurate value is $a = (1.04 \pm 0.09) \times 10^{-3}$ (with $\theta$ in
degrees).

• Below $S_{1.4\,\mathrm{GHz}} \sim 10$ mJy, the calibration problems
described in section 2.3 become dominant, as evidenced by a sudden increase in
the $\chi^2$ of the best fit, as well as direct measurement of the surface
density (Figure 1).

Table 1. Best-fitting amplitudes of a sum of two power-laws, for a range of
flux thresholds ($\theta$ is measured in degrees). The $1\sigma$ ranges of the
best-fitting parameters and the best-fitting reduced $\chi^2$ are also
displayed. $N$ is the number of sources analyzed at each threshold.

Flux threshold   N         Clustering power-law                  $1\sigma$ range   Size power-law                        $1\sigma$ range   $\chi^2_{\rm red}$
/ mJy                      $(a \times 10^{-3})\,\theta^{-0.8}$                     $(c \times 10^{-5})\,\theta^{-3.4}$
50               114,362   a = 1.25                              0.87 -> 1.63      c = 1.52                              1.45 -> 1.60      1.55
30               192,610   a = 1.33                              1.10 -> 1.55      c = 0.94                              0.90 -> 0.98      1.41
20               281,514   a = 1.26                              1.11 -> 1.42      c = 0.68                              0.65 -> 0.71      1.36
15               361,644   a = 1.12                              0.99 -> 1.24      c = 0.58                              0.55 -> 0.60      1.78
10               504,045   a = 1.04                              0.95 -> 1.13      c = 0.42                              0.40 -> 0.44      2.30
5                838,740   a = 1.29                              1.23 -> 1.34      c = 0.24                              0.23 -> 0.25      4.37



To view the errors on the clustering parameters
$w(\theta) = a\,\theta^{-b}$, in Figure 3 we plot contours of constant
$\chi^2$ in $(a, b)$ parameter space for flux thresholds 50 mJy, 20 mJy and 10
mJy. As the survey becomes deeper, the increasing number of sources enables us
to measure the clustering parameters more accurately until the calibration
problems intervene.

Considering the size distribution, we find: 

• The slope of the small-angle correlation function is independent of flux
threshold and has the value $-3.4$.

• The amplitude of the small-angle correlation function changes with flux
threshold (equivalently surface density $\sigma$) as $1/\sigma$, as predicted
by equation 4.
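
The inverse scaling with surface density can be checked roughly against the
values in Table 1. The sketch below assumes that the number of sources $N$ is
proportional to the surface density, which holds if the same masked area is
used at every threshold; the variable names are introduced here for
illustration.

```python
# Flux thresholds (mJy), source counts N and small-angle amplitudes c
# (in units of 1e-5) as listed in Table 1.
thresholds = [50, 30, 20, 15, 10, 5]
n_sources = [114362, 192610, 281514, 361644, 504045, 838740]
c_amp = [1.52, 0.94, 0.68, 0.58, 0.42, 0.24]

# If c scales as 1/sigma and N scales as sigma, the product c * N
# should be roughly constant across thresholds.
products = [c * n for c, n in zip(c_amp, n_sources)]
spread = max(products) / min(products)
density_range = max(n_sources) / min(n_sources)

for s, p in zip(thresholds, products):
    print(f"{s:>3} mJy: c x N = {p:.3g}")
print(f"spread in c x N: {spread:.2f}; range in N: {density_range:.2f}")
```

The product varies by far less than the source count does, although the
comparison ignores the quoted uncertainties on $c$.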

In Table 1 we list the best-fitting amplitudes for both power-laws for the
flux thresholds considered, obtained by minimizing the $\chi^2$ statistic. For
purposes of comparison, we fix the slopes of the small-angle and large-angle
power-laws at $-3.4$ and $-0.8$ respectively. The fits are performed to angles
$\theta > 2$ arcmin $= 0.033^\circ$, safely above the resolution limit of the
NVSS. We derive $1\sigma$ errors on the fitted amplitudes by varying each in
turn from the best-fitting combination and finding when $\chi^2$ increases by
1.0 from its minimum. The reduced $\chi^2$ statistic of the best fit is also
indicated.
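
With both slopes fixed, the model is linear in the two amplitudes, so the
$\chi^2$ minimum has a closed form. The following Python sketch shows one way
such a fit could be done; the function name `fit_two_power_laws`, the use of
`numpy.linalg.lstsq` and the synthetic test data are assumptions of this
sketch, not the procedure actually used to produce Table 1.

```python
import numpy as np

def fit_two_power_laws(theta, w, err, slope_clus=-0.8, slope_size=-3.4):
    """Weighted least-squares fit of w = a*theta**slope_clus
    + c*theta**slope_size with both slopes held fixed.

    Returns (a, c), their 1-sigma errors and the reduced chi-squared.
    """
    # Design matrix and data vector, each scaled by the measurement error.
    X = np.column_stack([theta ** slope_clus, theta ** slope_size]) / err[:, None]
    y = w / err
    params, *_ = np.linalg.lstsq(X, y, rcond=None)
    chi2 = float(np.sum((y - X @ params) ** 2))
    # Varying one amplitude with the other held at its best-fitting value,
    # chi2 rises by H_ii * (delta p)^2, where H = X^T X. A rise of 1.0
    # therefore corresponds to delta p = 1 / sqrt(H_ii).
    hess = X.T @ X
    sigma = 1.0 / np.sqrt(np.diag(hess))
    chi2_red = chi2 / (len(theta) - 2)
    return params, sigma, chi2_red

# Synthetic, noise-free example using the 10 mJy amplitudes as inputs.
theta = np.logspace(np.log10(0.033), 1.0, 20)           # degrees
w_true = 1.04e-3 * theta ** -0.8 + 0.42e-5 * theta ** -3.4
params, sigma, chi2_red = fit_two_power_laws(theta, w_true, 0.1 * w_true)
print(params, sigma, chi2_red)
```

Because each amplitude is varied with the other held fixed, these errors are
conditional ones and will be smaller than errors marginalised over the other
parameter when the two amplitudes are correlated.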

Previous correlation function analyses of radio surveys 
(Cress et al. 1996, Magliocchetti et al. 1998) did not consider 
either the gradients in source surface density (present in 
FIRST as well as NVSS, Blake & Wall in preparation) or 
the large angular scales on which multi-component sources 
affect the measured $w(\theta)$. Conclusions from these analyses
must be regarded as suspect. 



4.2 Determination of spatial clustering properties 

The angular distribution of galaxies is a projection of the true 3D
distribution. Not knowing the individual redshifts of the NVSS sources, we
cannot infer their full spatial distribution. However, we can infer their
spatial clustering properties from their angular clustering knowing only their
redshift distribution $N(z)$ (Loan et al. 1997). This is not surprising:
clustering properties are global statistical measures of the sample, like the
redshift distribution.




Figure 4. Determination of the spatial clustering length $r_0$ of NVSS sources
for a range of clustering indices $\epsilon$. The solid line corresponds to
the best-fitting angular clustering amplitude and the dashed lines encompass
the $1\sigma$ range. The vertical lines mark the special values of $\epsilon$
mentioned in the text.




Figure 5. Determination of the clustering length $r_0$ of NVSS sources for a
range of flux-density thresholds, assuming a clustering index
$\epsilon = \gamma - 1 = 0.8$.



The spatial correlation function $\xi(r)$ is usually parameterized by a
power-law of slope $\gamma$ and 'correlation length' $r_0$. To allow for
redshift evolution of clustering we introduce the 'clustering index'
$\epsilon$ such that

$$ \xi(r, z) = \left(\frac{r}{r_0}\right)^{-\gamma} (1 + z)^{-(3 + \epsilon)} $$

where $r$ is measured in proper co-ordinates. Then $\epsilon = 0$ implies
constant clustering in proper co-ordinates, $\epsilon = \gamma - 3$ implies
constant clustering in co-moving co-ordinates, and $\epsilon = \gamma - 1$
represents growth of clustering under linear theory (Peebles 1980).

Figure 6. The fraction of radio galaxies that are resolved by NVSS into
multiple components, as a function of flux threshold.



It can be shown (Peebles 1980) that a power-law $\xi(r)$ projects on to the
celestial sphere as another power-law $w(\theta) = a\,\theta^{1-\gamma}$ where
the projection involves an integral over $N(z)$. Conversely, knowing the
amplitude and slope of $w(\theta)$, we can deduce $\gamma$ and $r_0$. This is
discussed further by Cress & Kamionkowski (1998) and Magliocchetti et al.
(1998). Our measured slope of $-0.8$ implies that $\gamma = 1.8$; to deduce
$r_0$ we must also assume a value for $\epsilon$, a form for $N(z)$ and a
cosmology (we take $\Omega = 1$, $\Lambda = 0$). In the current absence of
observed redshift distributions for complete samples of mJy radio galaxies, we
use the $N(z)$ predicted from the luminosity-function models of Dunlop &
Peacock (1990). With these assumptions, in Figure 4 we use our 10 mJy
measurement of $w(\theta)$ to deduce $r_0$ for a range of $\epsilon$.

In Figure 5, we fix $\epsilon = \gamma - 1$ and investigate if $r_0$ depends
on flux threshold (using the appropriate $N(z)$ for each threshold). The
result is $r_0 \sim 6\,h^{-1}$ Mpc with a marginal dependence on flux. A more
detailed study of spatial clustering inferences will be the subject of a
future paper.



4.3 Determination of the size distribution 

Our measurements of the small-angle $w(\theta)$ indicate that the size
distribution of giant radio galaxies ($\theta > 2$ arcmin) is a power-law,
$f(\theta) \propto \theta^{-2.4}$, at all flux thresholds considered. (This
differs from the correlation function slope $-3.4$ by virtue of the extra
power of $\theta$ in equation 4.) Such a steep slope is naturally produced in
toy models: if radio sources have linear sizes up to a maximum $L_0$, then the
available volume for sources with angular sizes $> \theta$ scales as
$\theta^{-3}$.

We can use the measured amplitude of the small-angle $w(\theta)$, in
conjunction with equation 4, to determine the fraction $e$ of radio galaxies
that are resolved by NVSS into multi-component sources. Figure 6 illustrates
that $e \approx 0.07$.

To compare these numbers with observations, we note that Lara et al. (2001)
used visual inspection of the NVSS to find $\sim 80$ radio galaxies with sizes
greater than $\theta_0 = 4$ arcmin and total fluxes
$S_{1.4\,\mathrm{GHz}} > 100$ mJy in the region $\delta > 60^\circ$.
Integrating our derived size distribution above $\theta_0$ predicts a total of
62 such objects.



5 CONCLUSIONS 

Using the NVSS radio survey, we have measured the angular correlation function
$w(\theta)$ of radio galaxies with unprecedented precision. The results may be
summarized as follows:

(i) The correlation function has two contributions: that due to multiple
components of the same galaxy, dominant at $\theta < 0.1^\circ$, and that due
to clustering between galaxies, which dominates at larger angles. A clear
break in $w(\theta)$ is evident between these scales.

(ii) The clustering part has a slope consistent with that measured for other
classes of galaxy, $w(\theta) \propto \theta^{-0.8}$.

(iii) The clustering amplitude corresponds to a spatial clustering length
$r_0 \sim 6\,h^{-1}$ Mpc (under certain assumptions), independent of
flux-density threshold.

(iv) The NVSS suffers from calibration problems that prevent the measurement
of $w(\theta)$ at flux densities below 10 mJy.

(v) The size distribution of arcminute radio sources is well-described by a
power-law with slope $-2.4$; $\sim 7$ per cent of galaxies are resolved by
NVSS into multiple components.

For the first time, our investigation has untangled the 
imprint of radio galaxy clustering from the other observa- 
tional effects, in particular the resolution of radio galaxies 
into multiple components. As such it opens a new observa- 
tional window for large-scale structure investigations, as well 
as providing a novel means of measuring the size distribution 
of giant radio galaxies.